[BugFix] Proper masks for padding with custom pad value #1185
Merged · +21 −11
vmoens added a commit that referenced this pull request on Jan 15, 2025
ghstack-source-id: 0580f89ce9bbaf5a13bab33f9c9b8f5a9e9df96f Pull Request resolved: #1185
facebook-github-bot added the CLA Signed label on Jan 15, 2025
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 46.0150μs | 21.0085μs | 47.5999 KOps/s | 49.5006 KOps/s | |
test_plain_set_stack_nested | 82.5130μs | 21.3020μs | 46.9440 KOps/s | 49.0733 KOps/s | |
test_plain_set_nested_inplace | 68.1370μs | 23.0831μs | 43.3217 KOps/s | 46.0904 KOps/s | |
test_plain_set_stack_nested_inplace | 75.6510μs | 22.9213μs | 43.6275 KOps/s | 46.5680 KOps/s | |
test_items | 49.7330μs | 4.2052μs | 237.7995 KOps/s | 236.1933 KOps/s | |
test_items_nested | 0.5831ms | 0.4013ms | 2.4920 KOps/s | 2.5599 KOps/s | |
test_items_nested_locked | 0.7123ms | 0.3992ms | 2.5052 KOps/s | 2.5563 KOps/s | |
test_items_nested_leaf | 0.1672ms | 78.4579μs | 12.7457 KOps/s | 12.9929 KOps/s | |
test_items_stack_nested | 0.5683ms | 0.4015ms | 2.4905 KOps/s | 2.5525 KOps/s | |
test_items_stack_nested_leaf | 0.1575ms | 81.4893μs | 12.2716 KOps/s | 13.0697 KOps/s | |
test_items_stack_nested_locked | 0.6115ms | 0.4001ms | 2.4995 KOps/s | 2.5230 KOps/s | |
test_keys | 41.9180μs | 3.4276μs | 291.7536 KOps/s | 276.7120 KOps/s | |
test_keys_nested | 0.2876ms | 0.1633ms | 6.1252 KOps/s | 6.1345 KOps/s | |
test_keys_nested_locked | 2.0667ms | 0.1692ms | 5.9096 KOps/s | 5.8916 KOps/s | |
test_keys_nested_leaf | 0.2656ms | 0.1443ms | 6.9302 KOps/s | 6.9882 KOps/s | |
test_keys_stack_nested | 0.2612ms | 0.1617ms | 6.1836 KOps/s | 6.1160 KOps/s | |
test_keys_stack_nested_leaf | 0.2549ms | 0.1410ms | 7.0924 KOps/s | 7.0025 KOps/s | |
test_keys_stack_nested_locked | 0.2617ms | 0.1683ms | 5.9420 KOps/s | 5.8847 KOps/s | |
test_values | 8.6722μs | 1.0454μs | 956.5683 KOps/s | 953.2344 KOps/s | |
test_values_nested | 0.1143ms | 63.0279μs | 15.8660 KOps/s | 16.2558 KOps/s | |
test_values_nested_locked | 0.1185ms | 63.0339μs | 15.8645 KOps/s | 16.2182 KOps/s | |
test_values_nested_leaf | 0.1242ms | 71.8792μs | 13.9122 KOps/s | 14.3820 KOps/s | |
test_values_stack_nested | 0.1098ms | 63.5127μs | 15.7449 KOps/s | 16.1471 KOps/s | |
test_values_stack_nested_leaf | 0.1247ms | 71.4445μs | 13.9969 KOps/s | 13.5777 KOps/s | |
test_values_stack_nested_locked | 0.1191ms | 63.2642μs | 15.8067 KOps/s | 16.1012 KOps/s | |
test_membership | 55.9310μs | 0.8673μs | 1.1531 MOps/s | 1.1402 MOps/s | |
test_membership_nested | 39.8250μs | 2.8262μs | 353.8352 KOps/s | 346.4926 KOps/s | |
test_membership_nested_leaf | 44.3520μs | 2.8806μs | 347.1464 KOps/s | 347.1271 KOps/s | |
test_membership_stacked_nested | 34.7250μs | 2.8212μs | 354.4608 KOps/s | 350.7223 KOps/s | |
test_membership_stacked_nested_leaf | 40.8360μs | 2.8554μs | 350.2141 KOps/s | 344.5498 KOps/s | |
test_membership_nested_last | 41.7670μs | 4.2543μs | 235.0573 KOps/s | 229.1790 KOps/s | |
test_membership_nested_leaf_last | 43.1110μs | 4.3173μs | 231.6286 KOps/s | 225.3108 KOps/s | |
test_membership_stacked_nested_last | 42.9500μs | 5.0110μs | 199.5618 KOps/s | 231.3741 KOps/s | |
test_membership_stacked_nested_leaf_last | 47.2770μs | 5.0788μs | 196.8974 KOps/s | 227.0917 KOps/s | |
test_nested_getleaf | 49.0820μs | 10.5699μs | 94.6081 KOps/s | 93.7361 KOps/s | |
test_nested_get | 47.4580μs | 9.9901μs | 100.0990 KOps/s | 98.7137 KOps/s | |
test_stacked_getleaf | 46.3170μs | 10.4901μs | 95.3283 KOps/s | 95.5808 KOps/s | |
test_stacked_get | 51.5160μs | 9.9829μs | 100.1714 KOps/s | 99.6017 KOps/s | |
test_nested_getitemleaf | 52.6980μs | 11.0708μs | 90.3278 KOps/s | 89.9414 KOps/s | |
test_nested_getitem | 47.8990μs | 10.6348μs | 94.0309 KOps/s | 93.5253 KOps/s | |
test_stacked_getitemleaf | 55.5440μs | 11.0171μs | 90.7679 KOps/s | 85.1366 KOps/s | |
test_stacked_getitem | 38.6920μs | 10.5459μs | 94.8233 KOps/s | 87.1899 KOps/s | |
test_lock_nested | 4.6409ms | 0.4507ms | 2.2187 KOps/s | 1.7701 KOps/s | |
test_lock_stack_nested | 0.8073ms | 0.4162ms | 2.4027 KOps/s | 2.3864 KOps/s | |
test_unlock_nested | 1.0750ms | 0.3751ms | 2.6658 KOps/s | 2.6520 KOps/s | |
test_unlock_stack_nested | 0.6126ms | 0.3389ms | 2.9505 KOps/s | 2.9654 KOps/s | |
test_flatten_speed | 0.1929ms | 0.1015ms | 9.8478 KOps/s | 10.1419 KOps/s | |
test_unflatten_speed | 0.9168ms | 0.5194ms | 1.9253 KOps/s | 1.9650 KOps/s | |
test_common_ops | 1.7686ms | 0.7987ms | 1.2521 KOps/s | 1.2780 KOps/s | |
test_creation | 56.3250μs | 2.4601μs | 406.4941 KOps/s | 407.0420 KOps/s | |
test_creation_empty | 52.3570μs | 12.7853μs | 78.2151 KOps/s | 86.1559 KOps/s | |
test_creation_nested_1 | 43.9520μs | 15.5759μs | 64.2016 KOps/s | 69.5487 KOps/s | |
test_creation_nested_2 | 57.6470μs | 20.5014μs | 48.7771 KOps/s | 52.5012 KOps/s | |
test_clone | 91.4300μs | 13.3411μs | 74.9565 KOps/s | 76.0282 KOps/s | |
test_getitem[int] | 1.3876ms | 12.8388μs | 77.8891 KOps/s | 78.1949 KOps/s | |
test_getitem[slice_int] | 0.1394ms | 23.9976μs | 41.6708 KOps/s | 40.7639 KOps/s | |
test_getitem[range] | 0.2964ms | 46.5292μs | 21.4919 KOps/s | 19.9627 KOps/s | |
test_getitem[tuple] | 0.1308ms | 19.9856μs | 50.0362 KOps/s | 50.0060 KOps/s | |
test_getitem[list] | 0.2058ms | 41.3968μs | 24.1564 KOps/s | 23.1713 KOps/s | |
test_setitem_dim[int] | 51.0150μs | 25.2102μs | 39.6665 KOps/s | 38.6785 KOps/s | |
test_setitem_dim[slice_int] | 82.3530μs | 50.1661μs | 19.9338 KOps/s | 19.2640 KOps/s | |
test_setitem_dim[range] | 0.1349ms | 72.7233μs | 13.7508 KOps/s | 13.4753 KOps/s | |
test_setitem_dim[tuple] | 81.1410μs | 40.4188μs | 24.7410 KOps/s | 24.8898 KOps/s | |
test_setitem | 0.1412ms | 20.9239μs | 47.7921 KOps/s | 49.1914 KOps/s | |
test_set | 97.7720μs | 20.1464μs | 49.6367 KOps/s | 50.5530 KOps/s | |
test_set_shared | 9.0176ms | 0.1735ms | 5.7633 KOps/s | 5.9606 KOps/s | |
test_update | 0.3081ms | 23.2935μs | 42.9305 KOps/s | 44.5555 KOps/s | |
test_update_nested | 0.2023ms | 33.3671μs | 29.9696 KOps/s | 30.7382 KOps/s | |
test_update__nested | 0.3569ms | 33.0831μs | 30.2270 KOps/s | 29.1636 KOps/s | |
test_set_nested | 0.1304ms | 22.1240μs | 45.1998 KOps/s | 45.7239 KOps/s | |
test_set_nested_new | 0.1406ms | 26.7976μs | 37.3168 KOps/s | 38.4600 KOps/s | |
test_select | 0.2425ms | 43.1543μs | 23.1727 KOps/s | 23.3315 KOps/s | |
test_select_nested | 0.1310ms | 63.9141μs | 15.6460 KOps/s | 15.6835 KOps/s | |
test_exclude_nested | 0.1760ms | 82.4527μs | 12.1282 KOps/s | 12.3011 KOps/s | |
test_empty[True] | 0.8286ms | 0.4101ms | 2.4384 KOps/s | 2.4860 KOps/s | |
test_empty[False] | 10.5423μs | 1.3868μs | 721.0973 KOps/s | 736.1673 KOps/s | |
test_unbind_speed | 0.3606ms | 0.2685ms | 3.7248 KOps/s | 3.7375 KOps/s | |
test_unbind_speed_stack0 | 0.5842ms | 0.2604ms | 3.8408 KOps/s | 3.7881 KOps/s | |
test_unbind_speed_stack1 | 0.1164s | 0.8025ms | 1.2462 KOps/s | 1.3515 KOps/s | |
test_split | 1.7770ms | 1.5710ms | 636.5263 Ops/s | 558.2888 Ops/s | |
test_chunk | 0.1159s | 1.9591ms | 510.4384 Ops/s | 566.2080 Ops/s | |
test_consolidate_njt[False-None] | 9.7298ms | 8.2390ms | 121.3745 Ops/s | 120.7094 Ops/s | |
test_creation[device0] | 0.2776ms | 90.5429μs | 11.0445 KOps/s | 10.4328 KOps/s | |
test_creation_from_tensor | 3.6522ms | 93.3657μs | 10.7106 KOps/s | 10.6138 KOps/s | |
test_add_one[memmap_tensor0] | 0.1366ms | 4.9204μs | 203.2357 KOps/s | 204.9534 KOps/s | |
test_contiguous[memmap_tensor0] | 13.1750μs | 0.5053μs | 1.9792 MOps/s | 1.9605 MOps/s | |
test_stack[memmap_tensor0] | 30.1460μs | 3.4121μs | 293.0782 KOps/s | 296.0159 KOps/s | |
test_memmaptd_index | 0.9522ms | 0.2345ms | 4.2638 KOps/s | 4.3586 KOps/s | |
test_memmaptd_index_astensor | 0.8008ms | 0.3206ms | 3.1189 KOps/s | 3.1492 KOps/s | |
test_memmaptd_index_op | 0.9889ms | 0.6002ms | 1.6662 KOps/s | 1.7314 KOps/s | |
test_serialize_model | 0.1239s | 0.1131s | 8.8381 Ops/s | 8.5717 Ops/s | |
test_serialize_model_pickle | 0.4652s | 0.3922s | 2.5494 Ops/s | 2.5302 Ops/s | |
test_serialize_weights | 0.1242s | 0.1157s | 8.6418 Ops/s | 8.6640 Ops/s | |
test_serialize_weights_returnearly | 0.2778s | 0.1799s | 5.5578 Ops/s | 6.3170 Ops/s | |
test_serialize_weights_pickle | 0.4452s | 0.4106s | 2.4352 Ops/s | 1.1240 Ops/s | |
test_serialize_weights_filesystem | 0.1482s | 0.1420s | 7.0439 Ops/s | 7.0488 Ops/s | |
test_serialize_model_filesystem | 0.1527s | 0.1486s | 6.7301 Ops/s | 6.8614 Ops/s | |
test_reshape_pytree | 61.0540μs | 26.3199μs | 37.9940 KOps/s | 38.0728 KOps/s | |
test_reshape_td | 0.1078ms | 32.1075μs | 31.1454 KOps/s | 31.2876 KOps/s | |
test_view_pytree | 61.0140μs | 25.9129μs | 38.5908 KOps/s | 38.3957 KOps/s | |
test_view_td | 93.7350μs | 37.9609μs | 26.3429 KOps/s | 26.9222 KOps/s | |
test_unbind_pytree | 83.5060μs | 29.2090μs | 34.2360 KOps/s | 34.6768 KOps/s | |
test_unbind_td | 0.3314ms | 38.8822μs | 25.7187 KOps/s | 25.8764 KOps/s | |
test_split_pytree | 92.8720μs | 29.1982μs | 34.2486 KOps/s | 33.9707 KOps/s | |
test_split_td | 0.4972ms | 44.4478μs | 22.4983 KOps/s | 22.2038 KOps/s | |
test_add_pytree | 83.6360μs | 34.9884μs | 28.5809 KOps/s | 28.8340 KOps/s | |
test_add_td | 0.1493ms | 56.3495μs | 17.7464 KOps/s | 17.5377 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1288ms | 61.8692μs | 16.1631 KOps/s | 16.1862 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4550ms | 0.1722ms | 5.8075 KOps/s | 5.7937 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1005ms | 45.1566μs | 22.1451 KOps/s | 22.3914 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2139ms | 0.1163ms | 8.5983 KOps/s | 8.5337 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 68.6780μs | 26.6209μs | 37.5645 KOps/s | 38.7658 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1176ms | 58.7174μs | 17.0307 KOps/s | 17.1987 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1682ms | 76.8531μs | 13.0118 KOps/s | 13.2115 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1559ms | 65.8325μs | 15.1901 KOps/s | 15.1424 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1810ms | 0.1054ms | 9.4882 KOps/s | 9.5683 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4457ms | 0.2146ms | 4.6594 KOps/s | 4.7043 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1379ms | 45.7560μs | 21.8551 KOps/s | 21.6204 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5542ms | 65.9426μs | 15.1647 KOps/s | 15.2218 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1916ms | 0.1023ms | 9.7797 KOps/s | 9.8306 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4102ms | 0.1974ms | 5.0658 KOps/s | 5.0237 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4131ms | 0.2290ms | 4.3673 KOps/s | 4.3736 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2265ms | 0.1061ms | 9.4284 KOps/s | 9.5469 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1707ms | 63.3245μs | 15.7917 KOps/s | 16.3713 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.3193ms | 47.7311μs | 20.9507 KOps/s | 21.9257 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.2343ms | 0.1559ms | 6.4147 KOps/s | 6.4497 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2331ms | 0.1033ms | 9.6798 KOps/s | 9.8608 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 69.9100μs | 22.3550μs | 44.7327 KOps/s | 46.9303 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.2869ms | 67.5124μs | 14.8121 KOps/s | 15.2980 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1506ms | 77.6085μs | 12.8852 KOps/s | 13.1535 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1324ms | 66.3184μs | 15.0788 KOps/s | 15.1914 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3046ms | 0.2042ms | 4.8968 KOps/s | 4.9074 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.5266ms | 1.3371ms | 747.8985 Ops/s | 774.5594 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2859ms | 0.1997ms | 5.0082 KOps/s | 5.0065 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3579ms | 0.7655ms | 1.3063 KOps/s | 1.2877 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5494ms | 0.4448ms | 2.2484 KOps/s | 2.2590 KOps/s | |
test_compile_assign_and_add_stack[eager] | 5.2138ms | 2.6803ms | 373.0994 Ops/s | 371.6976 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 81.0210μs | 35.4650μs | 28.1968 KOps/s | 28.5007 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5185ms | 31.8056μs | 31.4410 KOps/s | 30.7241 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 78.5960μs | 29.2247μs | 34.2176 KOps/s | 35.7381 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 96.6600μs | 22.8575μs | 43.7493 KOps/s | 44.5551 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1694ms | 29.9139μs | 33.4293 KOps/s | 34.7054 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 77.8350μs | 22.6794μs | 44.0929 KOps/s | 44.7910 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1121ms | 50.4940μs | 19.8043 KOps/s | 18.8860 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.6032ms | 19.4896μs | 51.3094 KOps/s | 49.7320 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1599ms | 43.4524μs | 23.0137 KOps/s | 22.8368 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1926ms | 18.9412μs | 52.7950 KOps/s | 54.8608 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1123ms | 44.3471μs | 22.5494 KOps/s | 22.5929 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 61.4340μs | 18.3054μs | 54.6288 KOps/s | 54.7954 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1760ms | 51.1564μs | 19.5479 KOps/s | 19.0155 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.0827ms | 19.5639μs | 51.1146 KOps/s | 51.0027 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.4071ms | 44.5739μs | 22.4347 KOps/s | 22.5492 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1074ms | 18.1178μs | 55.1943 KOps/s | 54.7130 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1204ms | 44.1631μs | 22.6433 KOps/s | 22.5915 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.6194ms | 18.5438μs | 53.9265 KOps/s | 54.6683 KOps/s | |
test_mod_add[eager] | 0.1131ms | 35.2952μs | 28.3324 KOps/s | 28.8804 KOps/s | |
test_mod_add[compile] | 0.1367ms | 48.2519μs | 20.7246 KOps/s | 21.0344 KOps/s | |
test_mod_add[compile-overhead] | 0.1227ms | 47.0711μs | 21.2444 KOps/s | 21.0278 KOps/s | |
test_mod_wrap[eager] | 0.3490ms | 0.2255ms | 4.4354 KOps/s | 4.4899 KOps/s | |
test_mod_wrap[compile] | 0.3121ms | 0.2061ms | 4.8512 KOps/s | 4.7742 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4077ms | 0.2079ms | 4.8106 KOps/s | 4.8838 KOps/s | |
test_mod_wrap_and_backward[eager] | 16.1705ms | 12.2468ms | 81.6542 Ops/s | 82.0330 Ops/s | |
test_mod_wrap_and_backward[compile] | 14.2347ms | 13.0018ms | 76.9125 Ops/s | 72.8814 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 16.0613ms | 13.0281ms | 76.7571 Ops/s | 71.8965 Ops/s | |
test_seq_add[eager] | 0.1913ms | 0.1147ms | 8.7153 KOps/s | 8.4881 KOps/s | |
test_seq_add[compile] | 0.1398ms | 62.4292μs | 16.0181 KOps/s | 15.9808 KOps/s | |
test_seq_add[compile-overhead] | 0.1232ms | 61.1392μs | 16.3561 KOps/s | 16.4510 KOps/s | |
test_seq_wrap[eager] | 0.7458ms | 0.4547ms | 2.1992 KOps/s | 2.0770 KOps/s | |
test_seq_wrap[compile] | 0.3475ms | 0.2271ms | 4.4035 KOps/s | 4.3745 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3830ms | 0.2254ms | 4.4375 KOps/s | 4.3684 KOps/s | |
test_func_call_runtime[False-eager] | 1.0185ms | 0.5386ms | 1.8567 KOps/s | 1.9001 KOps/s | |
test_func_call_runtime[False-compile] | 0.8285ms | 0.4243ms | 2.3569 KOps/s | 2.3695 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.7870ms | 0.4237ms | 2.3599 KOps/s | 2.3542 KOps/s | |
test_func_call_runtime[True-eager] | 1.6216ms | 0.7542ms | 1.3258 KOps/s | 1.3375 KOps/s | |
test_func_call_runtime[True-compile] | 0.5825ms | 0.4634ms | 2.1582 KOps/s | 2.1351 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5856ms | 0.4622ms | 2.1633 KOps/s | 2.1403 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.6431ms | 0.5305ms | 1.8849 KOps/s | 1.9120 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.7388ms | 0.4197ms | 2.3825 KOps/s | 2.3845 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.8783ms | 0.4223ms | 2.3682 KOps/s | 2.3775 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2038ms | 0.8781ms | 1.1388 KOps/s | 1.1179 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9016ms | 0.4856ms | 2.0593 KOps/s | 2.0547 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.6210ms | 0.4879ms | 2.0498 KOps/s | 2.0061 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.8580ms | 1.9313ms | 517.7922 Ops/s | 528.3363 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9895ms | 0.5170ms | 1.9341 KOps/s | 1.9198 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 1.0842ms | 0.5280ms | 1.8939 KOps/s | 1.9329 KOps/s | |
test_distributed | 0.3601ms | 0.1232ms | 8.1157 KOps/s | 7.7630 KOps/s | |
test_tdmodule | 0.1094ms | 25.7210μs | 38.8787 KOps/s | 36.6700 KOps/s | |
test_tdmodule_dispatch | 78.8370μs | 48.4898μs | 20.6229 KOps/s | 20.2295 KOps/s | |
test_tdseq | 48.3900μs | 28.2696μs | 35.3737 KOps/s | 34.2147 KOps/s | |
test_tdseq_dispatch | 86.7620μs | 54.2254μs | 18.4415 KOps/s | 18.4378 KOps/s | |
test_instantiation_functorch | 2.5960ms | 1.5173ms | 659.0593 Ops/s | 667.2560 Ops/s | |
test_exec_functorch | 0.3980ms | 0.1756ms | 5.6956 KOps/s | 5.5660 KOps/s | |
test_exec_functional_call | 0.2851ms | 0.1687ms | 5.9268 KOps/s | 5.8303 KOps/s | |
test_exec_td_decorator | 0.5106ms | 0.2286ms | 4.3739 KOps/s | 4.2672 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.1458ms | 0.6695ms | 1.4937 KOps/s | 1.5345 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8515ms | 0.6651ms | 1.5034 KOps/s | 1.5297 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8383ms | 0.5422ms | 1.8445 KOps/s | 1.8985 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8648ms | 0.5389ms | 1.8555 KOps/s | 1.9034 KOps/s | |
test_to_module_speed[True] | 1.5534ms | 1.3350ms | 749.0888 Ops/s | 754.1608 Ops/s | |
test_to_module_speed[False] | 1.7269ms | 1.2997ms | 769.4336 Ops/s | 772.9486 Ops/s | |
test_tc_init | 87.8330μs | 45.9937μs | 21.7421 KOps/s | 21.1106 KOps/s | |
test_tc_init_nested | 0.2799ms | 91.7538μs | 10.8987 KOps/s | 10.7793 KOps/s | |
test_tc_first_layer_tensor | 18.7250μs | 1.5080μs | 663.1437 KOps/s | 632.7027 KOps/s | |
test_tc_first_layer_nontensor | 28.4730μs | 4.5700μs | 218.8185 KOps/s | 213.7197 KOps/s | |
test_tc_second_layer_tensor | 43.9620μs | 2.7990μs | 357.2725 KOps/s | 338.9781 KOps/s | |
test_tc_second_layer_nontensor | 35.1360μs | 5.9999μs | 166.6708 KOps/s | 165.6052 KOps/s | |
test_unbind | 0.2434s | 13.9177ms | 71.8510 Ops/s | 74.7209 Ops/s | |
test_full_like | 20.7625ms | 15.8492ms | 63.0945 Ops/s | 79.1131 Ops/s | |
test_zeros_like | 13.7654ms | 8.1500ms | 122.7001 Ops/s | 135.1131 Ops/s | |
test_ones_like | 12.8388ms | 8.5374ms | 117.1311 Ops/s | 135.3900 Ops/s | |
test_clone | 13.4490ms | 10.6682ms | 93.7368 Ops/s | 107.2054 Ops/s | |
test_squeeze | 66.9850μs | 12.0372μs | 83.0758 KOps/s | 85.0955 KOps/s | |
test_unsqueeze | 0.3142ms | 90.3824μs | 11.0641 KOps/s | 10.9628 KOps/s | |
test_split | 0.3447ms | 0.1875ms | 5.3325 KOps/s | 5.0888 KOps/s | |
test_permute | 0.3125ms | 0.1985ms | 5.0387 KOps/s | 5.0405 KOps/s | |
test_stack | 31.4457ms | 28.0184ms | 35.6909 Ops/s | 39.2796 Ops/s | |
test_cat | 35.0638ms | 27.3414ms | 36.5745 Ops/s | 40.7860 Ops/s |
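
The Change column is empty in the table above. As a rough sketch, assuming Change is meant to be the relative difference between Ops and Ops on Repo HEAD (an assumption; the benchmark bot's actual formula is not shown here), it could be recomputed from the two throughput columns. `parse_ops` and `relative_change` are hypothetical helper names, not part of any benchmark tooling.

```python
import re

# Multipliers for the throughput units that appear in the Ops columns.
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '47.5999 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Relative throughput change versus repo HEAD (assumed definition)."""
    pr, head = parse_ops(ops), parse_ops(ops_head)
    return (pr - head) / head


# Row test_plain_set_nested from the table above.
change = relative_change("47.5999 KOps/s", "49.5006 KOps/s")
print(f"test_plain_set_nested: {change:+.2%}")
```

A negative value means the PR branch measured fewer operations per second than repo HEAD for that test. Single runs like these are noisy, so small differences should not be read as regressions on their own.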

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 41.4310μs | 10.9992μs | 90.9154 KOps/s | 75.4151 KOps/s | |
test_plain_set_stack_nested | 38.0010μs | 11.3136μs | 88.3893 KOps/s | 73.7871 KOps/s | |
test_plain_set_nested_inplace | 58.6710μs | 12.2312μs | 81.7582 KOps/s | 68.7226 KOps/s | |
test_plain_set_stack_nested_inplace | 35.5110μs | 12.2871μs | 81.3864 KOps/s | 68.7982 KOps/s | |
test_items | 0.1762ms | 2.8925μs | 345.7197 KOps/s | 340.2915 KOps/s | |
test_items_nested | 0.5574ms | 0.3693ms | 2.7075 KOps/s | 2.7359 KOps/s | |
test_items_nested_locked | 0.5735ms | 0.3721ms | 2.6875 KOps/s | 2.7706 KOps/s | |
test_items_nested_leaf | 0.2444ms | 58.7061μs | 17.0340 KOps/s | 16.9734 KOps/s | |
test_items_stack_nested | 0.5604ms | 0.3693ms | 2.7075 KOps/s | 2.7451 KOps/s | |
test_items_stack_nested_leaf | 0.1568ms | 58.4080μs | 17.1209 KOps/s | 17.0618 KOps/s | |
test_items_stack_nested_locked | 0.4421ms | 0.3658ms | 2.7337 KOps/s | 2.7524 KOps/s | |
test_keys | 32.4600μs | 3.4643μs | 288.6567 KOps/s | 289.9186 KOps/s | |
test_keys_nested | 0.1801ms | 87.2976μs | 11.4551 KOps/s | 11.3935 KOps/s | |
test_keys_nested_locked | 0.7575ms | 93.4704μs | 10.6986 KOps/s | 10.5761 KOps/s | |
test_keys_nested_leaf | 0.1239ms | 78.5533μs | 12.7302 KOps/s | 12.7056 KOps/s | |
test_keys_stack_nested | 0.1431ms | 88.5967μs | 11.2871 KOps/s | 11.3242 KOps/s | |
test_keys_stack_nested_leaf | 0.1227ms | 80.1693μs | 12.4736 KOps/s | 12.6800 KOps/s | |
test_keys_stack_nested_locked | 0.1547ms | 94.5318μs | 10.5785 KOps/s | 10.6400 KOps/s | |
test_values | 4.9852μs | 0.8489μs | 1.1781 MOps/s | 1.1794 MOps/s | |
test_values_nested | 64.6110μs | 37.7796μs | 26.4693 KOps/s | 26.2035 KOps/s | |
test_values_nested_locked | 66.6310μs | 39.1765μs | 25.5255 KOps/s | 25.2670 KOps/s | |
test_values_nested_leaf | 70.6810μs | 41.9586μs | 23.8330 KOps/s | 23.4916 KOps/s | |
test_values_stack_nested | 0.1448ms | 38.0709μs | 26.2668 KOps/s | 26.4032 KOps/s | |
test_values_stack_nested_leaf | 75.1710μs | 42.2917μs | 23.6453 KOps/s | 23.6554 KOps/s | |
test_values_stack_nested_locked | 78.3920μs | 39.7936μs | 25.1297 KOps/s | 25.2648 KOps/s | |
test_membership | 2.7406μs | 0.5084μs | 1.9670 MOps/s | 1.9722 MOps/s | |
test_membership_nested | 17.1700μs | 1.9873μs | 503.1910 KOps/s | 501.9855 KOps/s | |
test_membership_nested_leaf | 19.1855μs | 1.9831μs | 504.2615 KOps/s | 504.3797 KOps/s | |
test_membership_stacked_nested | 37.8400μs | 2.0777μs | 481.3091 KOps/s | 479.2067 KOps/s | |
test_membership_stacked_nested_leaf | 69.3510μs | 2.0555μs | 486.4967 KOps/s | 485.8151 KOps/s | |
test_membership_nested_last | 41.2010μs | 3.0811μs | 324.5561 KOps/s | 322.5163 KOps/s | |
test_membership_nested_leaf_last | 38.9610μs | 3.1176μs | 320.7583 KOps/s | 320.3574 KOps/s | |
test_membership_stacked_nested_last | 43.5310μs | 5.9595μs | 167.7996 KOps/s | 319.9951 KOps/s | |
test_membership_stacked_nested_leaf_last | 41.1100μs | 5.9093μs | 169.2249 KOps/s | 323.7048 KOps/s | |
test_nested_getleaf | 42.9200μs | 6.0842μs | 164.3610 KOps/s | 164.2049 KOps/s | |
test_nested_get | 37.9210μs | 5.8463μs | 171.0480 KOps/s | 171.9833 KOps/s | |
test_stacked_getleaf | 73.6610μs | 6.1108μs | 163.6445 KOps/s | 163.3511 KOps/s | |
test_stacked_get | 39.1100μs | 5.8326μs | 171.4505 KOps/s | 172.2528 KOps/s | |
test_nested_getitemleaf | 33.4500μs | 6.4230μs | 155.6894 KOps/s | 153.5065 KOps/s | |
test_nested_getitem | 62.4110μs | 6.1041μs | 163.8237 KOps/s | 161.2509 KOps/s | |
test_stacked_getitemleaf | 30.1700μs | 6.4137μs | 155.9159 KOps/s | 156.6211 KOps/s | |
test_stacked_getitem | 53.3010μs | 6.1195μs | 163.4118 KOps/s | 164.0252 KOps/s | |
test_lock_nested | 9.0682ms | 0.3802ms | 2.6305 KOps/s | 2.6194 KOps/s | |
test_lock_stack_nested | 0.4460ms | 0.3393ms | 2.9470 KOps/s | 2.8901 KOps/s | |
test_unlock_nested | 0.6221ms | 0.3132ms | 3.1930 KOps/s | 3.1609 KOps/s | |
test_unlock_stack_nested | 0.3336ms | 0.2781ms | 3.5954 KOps/s | 3.5070 KOps/s | |
test_flatten_speed | 0.1206ms | 74.6966μs | 13.3875 KOps/s | 13.2455 KOps/s | |
test_unflatten_speed | 0.5005ms | 0.3170ms | 3.1545 KOps/s | 3.1428 KOps/s | |
test_common_ops | 1.6711ms | 0.5643ms | 1.7721 KOps/s | 1.5141 KOps/s | |
test_creation | 0.1697ms | 1.7410μs | 574.3782 KOps/s | 568.7501 KOps/s | |
test_creation_empty | 41.1310μs | 6.6196μs | 151.0656 KOps/s | 96.8788 KOps/s | |
test_creation_nested_1 | 32.2800μs | 8.1986μs | 121.9727 KOps/s | 81.7785 KOps/s | |
test_creation_nested_2 | 36.1210μs | 11.0945μs | 90.1350 KOps/s | 67.4625 KOps/s | |
test_clone | 0.1701ms | 10.0265μs | 99.7361 KOps/s | 99.0541 KOps/s | |
test_getitem[int] | 1.3798ms | 10.6954μs | 93.4978 KOps/s | 93.5741 KOps/s | |
test_getitem[slice_int] | 0.2038ms | 20.7319μs | 48.2349 KOps/s | 48.8096 KOps/s | |
test_getitem[range] | 0.1460ms | 36.2525μs | 27.5843 KOps/s | 26.9204 KOps/s | |
test_getitem[tuple] | 0.1096ms | 18.0826μs | 55.3016 KOps/s | 54.8481 KOps/s | |
test_getitem[list] | 0.2204ms | 32.6562μs | 30.6221 KOps/s | 30.2562 KOps/s | |
test_setitem_dim[int] | 28.1010μs | 18.8203μs | 53.1340 KOps/s | 51.5494 KOps/s | |
test_setitem_dim[slice_int] | 59.3610μs | 37.5900μs | 26.6028 KOps/s | 25.3518 KOps/s | |
test_setitem_dim[range] | 0.1451ms | 52.4939μs | 19.0498 KOps/s | 18.0261 KOps/s | |
test_setitem_dim[tuple] | 67.7010μs | 31.1358μs | 32.1174 KOps/s | 29.9632 KOps/s | |
test_setitem | 99.3410μs | 13.3235μs | 75.0554 KOps/s | 58.8323 KOps/s | |
test_set | 0.1289ms | 12.9149μs | 77.4301 KOps/s | 61.3566 KOps/s | |
test_set_shared | 1.4161ms | 0.1508ms | 6.6315 KOps/s | 6.5481 KOps/s | |
test_update | 0.8238ms | 14.9109μs | 67.0650 KOps/s | 51.4220 KOps/s | |
test_update_nested | 0.1596ms | 20.2695μs | 49.3353 KOps/s | 40.0178 KOps/s | |
test_update__nested | 1.2059ms | 25.0171μs | 39.9726 KOps/s | 40.7990 KOps/s | |
test_set_nested | 81.7210μs | 14.1102μs | 70.8705 KOps/s | 61.0158 KOps/s | |
test_set_nested_new | 83.1920μs | 16.8220μs | 59.4461 KOps/s | 53.4716 KOps/s | |
test_select | 0.1183ms | 28.2974μs | 35.3389 KOps/s | 32.4458 KOps/s | |
test_select_nested | 75.1110μs | 44.1641μs | 22.6428 KOps/s | 22.6823 KOps/s | |
test_exclude_nested | 96.6410μs | 64.3323μs | 15.5443 KOps/s | 15.8307 KOps/s | |
test_empty[True] | 0.3323ms | 0.2981ms | 3.3544 KOps/s | 3.3630 KOps/s | |
test_empty[False] | 3.5270μs | 0.8223μs | 1.2162 MOps/s | 1.2080 MOps/s | |
test_to | 86.4410μs | 56.3966μs | 17.7316 KOps/s | 17.3436 KOps/s | |
test_to_nonblocking | 0.1966ms | 48.0851μs | 20.7964 KOps/s | 20.9050 KOps/s | |
test_unbind_speed | 1.7699ms | 0.2332ms | 4.2877 KOps/s | 4.2746 KOps/s | |
test_unbind_speed_stack0 | 0.3026ms | 0.2341ms | 4.2711 KOps/s | 4.2339 KOps/s | |
test_unbind_speed_stack1 | 92.5435ms | 0.6601ms | 1.5149 KOps/s | 1.4998 KOps/s | |
test_split | 93.2348ms | 1.5874ms | 629.9486 Ops/s | 585.6014 Ops/s | |
test_chunk | 95.4882ms | 1.5921ms | 628.0959 Ops/s | 628.2792 Ops/s | |
test_consolidate[False-None] | 96.1129ms | 2.8978ms | 345.0897 Ops/s | 372.5370 Ops/s | |
test_consolidate[default-None] | 2.1082ms | 1.6783ms | 595.8288 Ops/s | 599.4027 Ops/s | |
test_consolidate[reduce-overhead-None] | 2.1087ms | 1.6973ms | 589.1618 Ops/s | 585.4651 Ops/s | |
test_consolidate_njt[False-None] | 7.0363ms | 6.4948ms | 153.9685 Ops/s | 154.1966 Ops/s | |
test_to[False-False-None] | 1.8711ms | 1.7061ms | 586.1337 Ops/s | 578.4900 Ops/s | |
test_to[True-False-None] | 1.5499ms | 1.3076ms | 764.7463 Ops/s | 785.1028 Ops/s | |
test_to[within-False-None] | 4.3476ms | 4.0976ms | 244.0466 Ops/s | 248.2087 Ops/s | |
test_to[True-default-None] | 5.6986ms | 5.3530ms | 186.8103 Ops/s | 193.0260 Ops/s | |
test_to_njt[False-False-None] | 7.0204ms | 6.8539ms | 145.9025 Ops/s | 146.3887 Ops/s | |
test_to_njt[True-False-None] | 5.8694ms | 5.4891ms | 182.1800 Ops/s | 182.6964 Ops/s | |
test_to_njt[within-False-None] | 12.3338ms | 12.1806ms | 82.0977 Ops/s | 82.5426 Ops/s | |
test_creation[device0] | 0.3729ms | 79.4100μs | 12.5929 KOps/s | 12.6256 KOps/s | |
test_creation_from_tensor | 0.4928ms | 82.2974μs | 12.1511 KOps/s | 11.9948 KOps/s | |
test_add_one[memmap_tensor0] | 0.4220ms | 6.1668μs | 162.1583 KOps/s | 162.7507 KOps/s | |
test_contiguous[memmap_tensor0] | 2.9950μs | 0.4191μs | 2.3861 MOps/s | 2.4382 MOps/s | |
test_stack[memmap_tensor0] | 28.7800μs | 4.2193μs | 237.0055 KOps/s | 235.3079 KOps/s | |
test_memmaptd_index | 1.7260ms | 0.2510ms | 3.9840 KOps/s | 3.9295 KOps/s | |
test_memmaptd_index_astensor | 0.9257ms | 0.3176ms | 3.1487 KOps/s | 3.1786 KOps/s | |
test_memmaptd_index_op | 1.0141ms | 0.5470ms | 1.8281 KOps/s | 1.6269 KOps/s | |
test_serialize_model | 0.1313s | 0.1304s | 7.6659 Ops/s | 7.5979 Ops/s | |
test_serialize_model_pickle | 1.3487s | 1.1892s | 0.8409 Ops/s | 0.8234 Ops/s | |
test_serialize_weights | 0.1315s | 0.1302s | 7.6778 Ops/s | 7.6223 Ops/s | |
test_serialize_weights_returnearly | 0.4409s | 68.0247ms | 14.7005 Ops/s | 23.6801 Ops/s | |
test_serialize_weights_pickle | 1.3778s | 1.1978s | 0.8349 Ops/s | 0.8202 Ops/s | |
test_reshape_pytree | 73.4410μs | 22.0602μs | 45.3305 KOps/s | 44.9672 KOps/s | |
test_reshape_td | 65.1510μs | 28.7273μs | 34.8101 KOps/s | 36.0503 KOps/s | |
test_view_pytree | 82.4110μs | 23.4057μs | 42.7246 KOps/s | 45.6447 KOps/s | |
test_view_td | 0.1843ms | 33.2403μs | 30.0840 KOps/s | 30.8309 KOps/s | |
test_unbind_pytree | 0.1877ms | 28.1007μs | 35.5863 KOps/s | 36.2503 KOps/s | |
test_unbind_td | 0.8355ms | 35.7880μs | 27.9423 KOps/s | 27.6622 KOps/s | |
test_split_pytree | 68.1910μs | 30.3431μs | 32.9565 KOps/s | 33.4827 KOps/s | |
test_split_td | 1.0628ms | 42.1616μs | 23.7182 KOps/s | 25.3323 KOps/s | |
test_add_pytree | 0.1628ms | 34.2571μs | 29.1910 KOps/s | 30.1979 KOps/s | |
test_add_td | 0.2052ms | 46.3508μs | 21.5746 KOps/s | 19.4523 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2377ms | 0.1193ms | 8.3834 KOps/s | 7.6477 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2803ms | 0.1327ms | 7.5361 KOps/s | 7.2824 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1451ms | 94.2275μs | 10.6126 KOps/s | 10.2550 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.7098ms | 0.1476ms | 6.7769 KOps/s | 6.2800 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1569ms | 22.6662μs | 44.1185 KOps/s | 41.9671 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 82.3820μs | 29.4154μs | 33.9958 KOps/s | 34.1579 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4608ms | 64.4105μs | 15.5254 KOps/s | 15.6496 KOps/s | |
test_compile_copy_nested[pytree-eager] | 94.8010μs | 49.3518μs | 20.2627 KOps/s | 20.3449 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2008ms | 0.1415ms | 7.0659 KOps/s | 7.0422 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3148ms | 0.2192ms | 4.5618 KOps/s | 4.5528 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2568ms | 96.7418μs | 10.3368 KOps/s | 9.7367 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2050ms | 55.6310μs | 17.9756 KOps/s | 16.9781 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2870ms | 0.1346ms | 7.4303 KOps/s | 7.3241 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6156ms | 0.4691ms | 2.1318 KOps/s | 1.9558 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4581ms | 0.2614ms | 3.8249 KOps/s | 3.7794 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2869ms | 0.1434ms | 6.9753 KOps/s | 6.8557 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2172ms | 66.5657μs | 15.0228 KOps/s | 14.1344 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1349ms | 97.9194μs | 10.2125 KOps/s | 9.9040 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5471ms | 0.3984ms | 2.5098 KOps/s | 2.4301 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2575ms | 0.1358ms | 7.3639 KOps/s | 7.1277 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1370ms | 19.7696μs | 50.5827 KOps/s | 55.2607 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 67.1510μs | 31.1572μs | 32.0953 KOps/s | 32.0279 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1097ms | 71.0372μs | 14.0771 KOps/s | 14.2830 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1768ms | 51.3972μs | 19.4563 KOps/s | 19.3539 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6044ms | 0.3867ms | 2.5858 KOps/s | 2.2636 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.7781ms | 2.5943ms | 385.4648 Ops/s | 386.0107 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5486ms | 0.3717ms | 2.6901 KOps/s | 2.2651 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.7760ms | 2.5849ms | 386.8608 Ops/s | 387.5334 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.5498ms | 0.1169ms | 8.5569 KOps/s | 8.5266 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5672ms | 79.1985μs | 12.6265 KOps/s | 11.9905 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.6413ms | 0.1056ms | 9.4714 KOps/s | 9.2425 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2474ms | 70.7042μs | 14.1434 KOps/s | 13.9386 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2859ms | 0.1118ms | 8.9455 KOps/s | 9.1241 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2427ms | 70.3664μs | 14.2113 KOps/s | 14.0967 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2674ms | 99.5632μs | 10.0439 KOps/s | 9.6402 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1696ms | 17.2115μs | 58.1007 KOps/s | 49.3295 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2219ms | 96.9083μs | 10.3190 KOps/s | 10.2574 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1196ms | 15.7886μs | 63.3370 KOps/s | 63.9805 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2319ms | 94.5386μs | 10.5777 KOps/s | 9.9028 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 54.4610μs | 15.6836μs | 63.7610 KOps/s | 63.7910 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2412ms | 0.1000ms | 9.9978 KOps/s | 9.6518 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5671ms | 16.8956μs | 59.1869 KOps/s | 57.3021 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1575ms | 95.3410μs | 10.4887 KOps/s | 10.0033 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1652ms | 15.6489μs | 63.9023 KOps/s | 63.4658 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1942ms | 94.3798μs | 10.5955 KOps/s | 10.3115 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.1634ms | 15.9662μs | 62.6322 KOps/s | 63.4240 KOps/s | |
test_mod_add[eager] | 0.1775ms | 36.5889μs | 27.3307 KOps/s | 25.3299 KOps/s | |
test_mod_add[compile] | 0.1493ms | 81.6171μs | 12.2523 KOps/s | 12.4111 KOps/s | |
test_mod_add[compile-overhead] | 0.3265ms | 0.1662ms | 6.0162 KOps/s | 5.7808 KOps/s | |
test_mod_wrap[eager] | 0.3302ms | 0.2576ms | 3.8819 KOps/s | 3.9788 KOps/s | |
test_mod_wrap[compile] | 0.3686ms | 0.2906ms | 3.4412 KOps/s | 3.4041 KOps/s | |
test_mod_wrap[compile-overhead] | 7.3998ms | 3.7227ms | 268.6230 Ops/s | 273.2742 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5333ms | 1.3527ms | 739.2779 Ops/s | 700.6771 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.8437ms | 1.3707ms | 729.5306 Ops/s | 787.7796 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.5006ms | 1.0221ms | 978.3997 Ops/s | 1.0873 KOps/s | |
test_seq_add[eager] | 0.2623ms | 0.1130ms | 8.8525 KOps/s | 8.3688 KOps/s | |
test_seq_add[compile] | 0.3277ms | 89.4615μs | 11.1780 KOps/s | 11.4572 KOps/s | |
test_seq_add[compile-overhead] | 0.2855ms | 0.1278ms | 7.8245 KOps/s | 7.7804 KOps/s | |
test_seq_wrap[eager] | 0.5394ms | 0.4065ms | 2.4601 KOps/s | 2.3244 KOps/s | |
test_seq_wrap[compile] | 0.4441ms | 0.2987ms | 3.3476 KOps/s | 3.3167 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3103ms | 0.2215ms | 4.5154 KOps/s | 4.4720 KOps/s | |
test_func_call_runtime[False-eager] | 0.8590ms | 0.7106ms | 1.4073 KOps/s | 1.3746 KOps/s | |
test_func_call_runtime[False-compile] | 0.8399ms | 0.7382ms | 1.3547 KOps/s | 1.3455 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4812ms | 0.3578ms | 2.7952 KOps/s | 2.7760 KOps/s | |
test_func_call_runtime[True-eager] | 0.9560ms | 0.8806ms | 1.1356 KOps/s | 1.1057 KOps/s | |
test_func_call_runtime[True-compile] | 0.8600ms | 0.7594ms | 1.3168 KOps/s | 1.3182 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4953ms | 0.3795ms | 2.6347 KOps/s | 2.6247 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8711ms | 0.7089ms | 1.4107 KOps/s | 1.3717 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8942ms | 0.7448ms | 1.3427 KOps/s | 1.2772 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5079ms | 0.3596ms | 2.7810 KOps/s | 2.7505 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1304ms | 0.9795ms | 1.0209 KOps/s | 996.5943 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.9364ms | 0.7896ms | 1.2665 KOps/s | 1.2126 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4762ms | 0.4066ms | 2.4595 KOps/s | 2.3759 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5162ms | 2.0278ms | 493.1444 Ops/s | 484.4594 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.2808ms | 0.8520ms | 1.1737 KOps/s | 1.2181 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5248ms | 0.4099ms | 2.4399 KOps/s | 2.4432 KOps/s | |
test_distributed | 5.7550ms | 0.2926ms | 3.4181 KOps/s | 7.8904 KOps/s | |
test_tdmodule | 0.1227ms | 19.4413μs | 51.4368 KOps/s | 47.0504 KOps/s | |
test_tdmodule_dispatch | 72.4410μs | 34.5589μs | 28.9361 KOps/s | 25.7622 KOps/s | |
test_tdseq | 33.9110μs | 19.5684μs | 51.1027 KOps/s | 44.0314 KOps/s | |
test_tdseq_dispatch | 57.4410μs | 36.3246μs | 27.5295 KOps/s | 23.2164 KOps/s | |
test_instantiation_functorch | 1.6350ms | 1.5503ms | 645.0447 Ops/s | 638.9040 Ops/s | |
test_exec_functorch | 0.2042ms | 0.1398ms | 7.1552 KOps/s | 7.2119 KOps/s | |
test_exec_functional_call | 0.2539ms | 0.1291ms | 7.7434 KOps/s | 7.7231 KOps/s | |
test_exec_td_decorator | 0.3663ms | 0.1775ms | 5.6342 KOps/s | 5.5852 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8076ms | 0.6685ms | 1.4959 KOps/s | 1.4746 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8501ms | 0.6679ms | 1.4973 KOps/s | 1.4728 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7269ms | 0.5816ms | 1.7193 KOps/s | 1.6376 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7266ms | 0.5825ms | 1.7167 KOps/s | 1.6333 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 18.7432ms | 18.6491ms | 53.6220 Ops/s | 53.1729 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 18.9432ms | 18.7436ms | 53.3516 Ops/s | 53.1372 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 18.8008ms | 18.6041ms | 53.7516 Ops/s | 53.4002 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 18.6392ms | 18.5735ms | 53.8401 Ops/s | 53.4868 Ops/s | |
test_to_module_speed[True] | 1.1066ms | 0.9856ms | 1.0146 KOps/s | 1.0157 KOps/s | |
test_to_module_speed[False] | 1.4400ms | 0.9613ms | 1.0403 KOps/s | 1.0510 KOps/s | |
test_tc_init | 0.1138ms | 35.9557μs | 27.8120 KOps/s | 26.4954 KOps/s | |
test_tc_init_nested | 0.3063ms | 71.6448μs | 13.9577 KOps/s | 12.8847 KOps/s | |
test_tc_first_layer_tensor | 20.9100μs | 0.8067μs | 1.2396 MOps/s | 1.4661 MOps/s | |
test_tc_first_layer_nontensor | 22.3900μs | 2.2694μs | 440.6385 KOps/s | 443.7174 KOps/s | |
test_tc_second_layer_tensor | 32.2352μs | 1.3848μs | 722.1416 KOps/s | 696.1258 KOps/s | |
test_tc_second_layer_nontensor | 26.6810μs | 3.0132μs | 331.8716 KOps/s | 331.3166 KOps/s | |
test_unbind | 0.2186s | 9.8724ms | 101.2930 Ops/s | 142.7024 Ops/s | |
test_full_like | 12.0812ms | 9.2154ms | 108.5135 Ops/s | 106.7814 Ops/s | |
test_zeros_like | 9.3354ms | 7.2564ms | 137.8091 Ops/s | 115.0167 Ops/s | |
test_ones_like | 5.2037ms | 4.3280ms | 231.0520 Ops/s | 232.3958 Ops/s | |
test_clone | 11.2816ms | 8.9988ms | 111.1255 Ops/s | 158.8522 Ops/s | |
test_squeeze | 0.1640ms | 9.7360μs | 102.7121 KOps/s | 106.2243 KOps/s | |
test_unsqueeze | 0.2258ms | 76.4581μs | 13.0791 KOps/s | 13.6609 KOps/s | |
test_split | 0.3699ms | 0.1594ms | 6.2750 KOps/s | 6.2141 KOps/s | |
test_permute | 0.2996ms | 0.1845ms | 5.4187 KOps/s | 5.7579 KOps/s | |
test_stack | 53.3321ms | 53.0161ms | 18.8622 Ops/s | 19.5305 Ops/s | |
test_cat | 53.2035ms | 52.8701ms | 18.9143 Ops/s | 19.9802 Ops/s |
Labels: CLA Signed (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed)
Stack from ghstack (oldest at bottom):